A Histogram Utilizing the Cluster Information
Authors
Abstract
Histograms are summary structures of large datasets, used mainly for selectivity estimation during query optimization. Selectivity estimation is the fast approximation of a query's result size. In this paper, we focus on multi-dimensional histograms, especially bidimensional histograms. At selectivity estimation time, buckets that partially overlap a query return approximate results under the assumption that all objects within them are uniformly distributed. However, because the objects within a query region are unlikely to be uniformly distributed, skews (or clusters) in buckets commonly degrade the accuracy of a histogram. Our aim is to utilize clusters in buckets to enhance the accuracy of selectivity estimation. We propose a new method that associates cluster information with a bucket. We present new schemes that define clusters formally, as well as algorithms that find such clusters efficiently. We show through experiments that our proposed method provides better performance than other well-known existing methods.
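To make the uniform-distribution assumption concrete, the following is a minimal Python sketch of how a bidimensional histogram might estimate a range query's result size. It illustrates only the baseline behaviour described in the abstract, not the cluster-based method the paper proposes. The names `Rect`, `Bucket`, and `estimate_result_size` are hypothetical and chosen for this illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (hypothetical helper type)."""
    x_lo: float
    y_lo: float
    x_hi: float
    y_hi: float

    def area(self) -> float:
        # Clamp each side at zero so an empty intersection has area 0.
        width = max(0.0, self.x_hi - self.x_lo)
        height = max(0.0, self.y_hi - self.y_lo)
        return width * height

    def intersect(self, other: "Rect") -> "Rect":
        # May produce an "inverted" rectangle when the inputs are disjoint;
        # area() treats that case as empty.
        return Rect(
            max(self.x_lo, other.x_lo),
            max(self.y_lo, other.y_lo),
            min(self.x_hi, other.x_hi),
            min(self.y_hi, other.y_hi),
        )


@dataclass(frozen=True)
class Bucket:
    """A histogram bucket: a region plus the number of objects inside it."""
    region: Rect
    count: int


def estimate_result_size(buckets: Iterable[Bucket], query: Rect) -> float:
    """Estimate how many objects fall in `query`, assuming objects are
    uniformly distributed within each bucket."""
    total = 0.0
    for bucket in buckets:
        bucket_area = bucket.region.area()
        if bucket_area == 0.0:
            continue  # degenerate bucket: skipped in this sketch
        overlap = bucket.region.intersect(query).area()
        # Each bucket contributes the fraction of its count that matches
        # the fraction of its area covered by the query.
        total += bucket.count * overlap / bucket_area
    return total


buckets = [
    Bucket(Rect(0, 0, 10, 10), 100),
    Bucket(Rect(10, 0, 20, 10), 40),
]
query = Rect(5, 0, 15, 10)
print(estimate_result_size(buckets, query))
```

In this sketch the query covers half of each bucket, so each bucket contributes half of its count whether or not its objects actually lie in that half. If all 100 objects of the first bucket sat in a cluster outside the query region, the estimate for that bucket would still be 50; this is the kind of error the abstract attributes to skews within buckets.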
Similar resources
Two-dimensional histogram equalization and contrast enhancement
Proposed method – Two-dimensional histogram equalization (2DHE) • Utilizing contextual information to enhance contrast • Based on contrast observation in image − Improved by increasing gray-level • Including global histogram algorithm − Special case of 2DHE • Automatic parameter selection algorithm contained
Wasserstein k-means++ for Cloud Regime Histogram Clustering
Much work has sought to discern the different types of cloud regimes, typically via Euclidean k-means clustering of histograms. However, these methods ignore the underlying similarity structure of cloud types. Wasserstein k-means clustering is a promising candidate for utilizing this structure during clustering, but existing algorithms do not scale well and lack the quality guarantees of the Eu...
Local vs. Global Histogram-Based Color Image Clustering
In this paper, we present two image clustering techniques to automatically group color images that correlate with semantic concepts. This work goes towards satisfying the ever growing need for techniques that are capable of automatically generating semantic concepts for images from their visual features. We present two techniques and evaluate their relative performances based on the perceptual ...
Adding spatial distribution clue to aggregated vector in image retrieval
This study proposes a novel algorithm that enhances the distinctiveness of the traditional vector of locally aggregated descriptors (VLAD) using spatial distribution clue of local features. The algorithm introduces a new method to compute the spatial distribution entropy (SDE) of clusters. Unlike conventional methods, this algorithm considers the distribution of full spatial information provide...
Iris Feature Extraction and Recognition using Unbalanced Haar Wavelets & Modified Multi Texton Histogram
Colored disk in the eye, the iris, attracted biometric Technologies to create potential and robust identification and verification systems designed for human identification in a no. of applications. Many techniques have been developed for iris recognition so far. Here, a new iris recognition system utilizing unbalanced wavelet coefficients and modified multi texton histogram feature coefficient...
Journal title:
Volume, issue
Pages -
Publication date 2005